Abstract

The problem of error correction in both coherent and noncoherent network coding is considered 
under an adversarial model. For coherent network coding, where knowledge of the network topology 
and network code is assumed at the source and destination nodes, the error correction capability of an 
(outer) code is succinctly described by the rank metric; as a consequence, it is shown that universal 
network error correcting codes achieving the Singleton bound can be easily constructed and efficiently 
decoded. For noncoherent network coding, where knowledge of the network topology and network code 
is not assumed, the error correction capability of a (subspace) code is given exactly by a new metric, 
called the injection metric, which is closely related to, but different from, the subspace metric of Kötter
and Kschischang. In particular, in the case of a non-constant-dimension code, the decoder associated
with the injection metric is shown to correct more errors than a minimum-subspace-distance decoder. All
of these results are based on a general approach to adversarial error correction, which could be useful 
for other adversarial channels beyond network coding. 

Index Terms 

Adversarial channels, error correction, injection distance, network coding, rank distance, subspace
codes. 

I. Introduction 

The problem of error correction for a network implementing linear network coding has been an active 
research area since 2002 [1]-[12]. The crucial motivation for the problem is the phenomenon of error 
propagation, which arises due to the recombination characteristic at the heart of network coding. A single
corrupt packet occurring in the application layer (e.g., introduced by a malicious user) may proceed 
undetected and contaminate other packets, causing potentially drastic consequences and essentially ruling 
out classical error correction approaches. 

In the basic multicast model for linear network coding, a source node transmits n packets, each
consisting of m symbols from a finite field $\mathbb{F}_q$. Each link in the network transports a packet free of
errors, and each node creates outgoing packets as $\mathbb{F}_q$-linear combinations of incoming packets. There are
one or more destination nodes that wish to obtain the original source packets. At a specific destination
node, the received packets may be represented as the rows of an $N \times m$ matrix Y = AX, where X is
the matrix whose rows are the source packets and A is the transfer matrix of the network. Errors are
incorporated in the model by allowing up to t error packets to be added (in the vector space $\mathbb{F}_q^m$) to the
packets sent over one or more links. The received matrix Y at a specific destination node may then be
written as

Y = AX + DZ (1) 

where Z is a $t \times m$ matrix whose rows are the error packets, and D is the transfer matrix from these
packets to the destination. Under this model, a coding-theoretic problem is how to design an outer code 
and the underlying network code such that reliable communication (to all destinations) is possible. 
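
To make the error-propagation effect in (1) concrete, the following minimal Python sketch evaluates Y = AX + DZ over the binary field. The matrices A, X, D and Z are made-up illustrative values (they are not taken from any particular network), and the helper names are hypothetical. A single injected packet (t = 1) corrupts two received packets, yet the error term DZ still has rank one.

```python
# Illustration of Y = AX + DZ over GF(2); all matrices are made-up examples.

def gf2_matmul(A, B):
    """Matrix product over GF(2); matrices are lists of 0/1 rows."""
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

def gf2_add(A, B):
    """Entrywise sum over GF(2) (addition and subtraction coincide)."""
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    rows, rank = [list(r) for r in M], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

A = [[1, 0, 1], [0, 1, 1], [1, 1, 0]]           # N x n transfer matrix
X = [[1, 0, 1, 1], [0, 1, 1, 0], [1, 1, 0, 0]]  # n x m source packets
D = [[1], [1], [0]]                             # N x t with t = 1
Z = [[1, 1, 1, 1]]                              # t x m error packet

Y = gf2_add(gf2_matmul(A, X), gf2_matmul(D, Z))
error = gf2_add(Y, gf2_matmul(A, X))            # Y - AX = DZ
corrupted_rows = sum(1 for row in error if any(row))
error_rank = gf2_rank(error)
print(corrupted_rows, error_rank)               # prints "2 1"
```

Counting corrupted rows overstates the adversary's effort here, whereas the rank of Y - AX does not; this is the observation that the rank metric of Section IV formalizes.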

This coding problem can be posed in a number of ways depending on the set of assumptions made. 
For example, we may assume that the network topology and the network code are known at the source 
and at the destination nodes, in which case we call the system coherent network coding. Alternatively, 
we may assume that such information is unavailable, in which case we call the system noncoherent 
network coding. The error matrix Z may be random or chosen by an adversary, and there may be further 
assumptions on the knowledge or other capabilities of the adversary. The essential assumption, in order 
to pose a meaningful coding problem, is that the number of injected error packets, t, is bounded. 

Error correction for coherent network coding was originally studied by Cai and Yeung [1]-[3]. Aiming
to establish fundamental limits, they focused on the case m = 1. In [2], [3] (see also [9],
[10]), the authors derive a Singleton bound in this context and construct codes that achieve this bound.
A drawback of their approach is that the field size required can be very large (on the order of $\binom{|\mathcal{E}|}{2t}$,
where $|\mathcal{E}|$ is the number of edges in the network), and no efficient decoding method is given. Similar
constructions, analyses and bounds appear also in [4], [5], [11], [12].

In Section IV, we approach this problem (for general m) under a different framework. We assume
the pessimistic situation in which the adversary can not only inject up to t packets but can also freely
choose the matrix D. In this scenario, it is essential to exploit the structure of the problem when m > 1.
The proposed approach allows us to find a metric — the rank metric — that succinctly describes the error 
correction capability of a code. We quite easily obtain bounds and constructions analogous to those of 
[2], [3], [9]-[11], and show that many of the results in [4], [12] can be reinterpreted and simplified in this
framework. Moreover, we find that our pessimistic assumption actually incurs no penalty since the codes 
we propose achieve the Singleton bound of [2]. An advantage of this approach is that it is universal, in 
the sense that the outer code and the network code may be designed independently of each other. More 
precisely, the outer code may be chosen as any rank-metric code with a good error-correction capability, 
while the network code can be designed as if the network were error-free (and, in particular, the field 
size can be chosen as the minimum required for multicast). An additional advantage is that encoding and 
decoding of properly chosen rank-metric codes can be performed very efficiently [8]. 

For noncoherent network coding, a combinatorial framework for error control was introduced by Kötter
and Kschischang in [7]. There, the problem is formulated as the transmission of subspaces through an
operator channel, where the transmitted and received subspaces are the row spaces of the matrices X and
Y in (1), respectively. They proposed a metric that is suitable for this channel, the so-called subspace
distance [7]. They also presented a Singleton-like bound for their metric and subspace codes achieving 
this bound. The main justification for their metric is the fact that a minimum subspace distance decoder 
seems to be the necessary and sufficient tool for optimally decoding the disturbances imposed by the 
operator channel. However, when these disturbances are translated to more concrete terms such as the 
number of error packets injected, only decoding guarantees can be obtained for the minimum distance 
decoder of [7], but no converse. More precisely, assume that t error packets are injected and a general 
(not necessarily constant-dimension) subspace code with minimum subspace distance d is used. In this 
case, while it is possible to guarantee successful decoding if t < d/4, and we know of specific examples
where decoding fails if this condition is not met, a general converse is not known. 

In Section V, we prove such a converse for a new metric, called the injection distance, under a
slightly different transmission model. We assume that the adversary is allowed to arbitrarily select the 
matrices A and D, provided that a lower bound on the rank of A is respected. Under this pessimistic 
scenario, we show that the injection distance is the fundamental parameter behind the error correction 
capability of a code; that is, we can guarantee correction of t packet errors if and only if t is less than half 
the minimum injection distance of the code. While this approach may seem too pessimistic, we provide 
a class of examples where a minimum-injection-distance decoder is able to correct more errors than a
minimum-subspace-distance decoder. Moreover, the two approaches coincide when a constant-dimension
code is used. 

In order to give a unified treatment of both coherent and noncoherent network coding, we first develop 
a general approach to error correction over (certain) adversarial channels. Our treatment generalizes the 
more abstract portions of classical coding theory and has the main feature of mathematical simplicity. The 
essence of our approach is to use a single function — called a discrepancy function — to fully describe an 
adversarial channel. We then propose a distance-like function that is easy to handle analytically and (in 
many cases, including all the channels considered in this paper) precisely describes the error correction 
capability of a code. The motivation for this approach is that, once such a distance function is found, 
one can virtually forget about the channel model and fully concentrate on the combinatorial problem 
of finding the largest code with a specified minimum distance (just like in classical coding theory). 
Interestingly, our approach is also useful to characterize the error detection capability of a code. 

The remainder of the paper is organized as follows. Section II establishes our notation and reviews
some basic facts about matrices and rank-metric codes. Section III presents our general approach
to adversarial error correction, which is subsequently specialized to coherent and noncoherent network
coding models. Section IV describes our main results for coherent network coding and discusses their
relationship with the work of Yeung et al. [2]-[4]. Section V describes our main results for noncoherent
network coding and discusses their relationship with the work of Kötter and Kschischang [7]. Section VI
presents our conclusions.

II. Preliminaries 

A. Basic Notation 

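Define $\mathbb{N} = \{0, 1, 2, \ldots\}$ and $[x]^+ = \max\{x, 0\}$. The following notation is used many times throughout
the paper. Let $\mathcal{X}$ be a set, and let $\mathcal{C} \subseteq \mathcal{X}$. Whenever a function $d\colon \mathcal{X} \times \mathcal{X} \to \mathbb{N}$ is defined, denote

$$d(\mathcal{C}) \triangleq \min_{x, x' \in \mathcal{C}:\ x \neq x'} d(x, x').$$

If $d(x, x')$ is called a "distance" between x and x', then $d(\mathcal{C})$ is called the minimum "distance" of $\mathcal{C}$.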

B. Matrices and Subspaces 

Let $\mathbb{F}_q$ denote the finite field with q elements. We use $\mathbb{F}_q^{n \times m}$ to denote the set of all $n \times m$ matrices
over $\mathbb{F}_q$ and use $\mathcal{P}_q(m)$ to denote the set of all subspaces of the vector space $\mathbb{F}_q^m$.

Let dim V denote the dimension of a vector space V, let $\langle X \rangle$ denote the row space of a matrix X,
and let wt(X) denote the number of nonzero rows of X. Recall that $\dim \langle X \rangle = \operatorname{rank} X \le \operatorname{wt}(X)$.

Let U and V be subspaces of some fixed vector space. Recall that the sum $U + V = \{u + v : u \in U,\ v \in V\}$
is the smallest vector space that contains both U and V, while the intersection $U \cap V$ is the
largest vector space that is contained in both U and V. Recall also that

$$\dim(U + V) = \dim U + \dim V - \dim(U \cap V). \qquad (2)$$

The rank of a matrix $X \in \mathbb{F}_q^{n \times m}$ is the smallest r for which there exist matrices $P \in \mathbb{F}_q^{n \times r}$ and
$Q \in \mathbb{F}_q^{r \times m}$ such that X = PQ. Note that both matrices obtained in the decomposition are full-rank;
accordingly, such a decomposition is called a full-rank decomposition [13]. In this case, note that, by
partitioning P and Q, the matrix X can be further expanded as

$$X = PQ = \begin{bmatrix} P' & P'' \end{bmatrix} \begin{bmatrix} Q' \\ Q'' \end{bmatrix} = P'Q' + P''Q''$$

where $\operatorname{rank}(P'Q') + \operatorname{rank}(P''Q'') = r$.

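Another useful property of the rank function is that, for $X \in \mathbb{F}_q^{n \times m}$ and $A \in \mathbb{F}_q^{N \times n}$, we have [13]

$$\operatorname{rank} A + \operatorname{rank} X - n \le \operatorname{rank} AX \le \min\{\operatorname{rank} A, \operatorname{rank} X\}. \qquad (3)$$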



C. Rank-Metric Codes 
Let $X, Y \in \mathbb{F}_q^{n \times m}$ be matrices. The rank distance between X and Y is defined as

$$d_R(X, Y) \triangleq \operatorname{rank}(Y - X).$$

It is well known that the rank distance is indeed a metric; in particular, it satisfies the triangle inequality
[13], [14].

A rank-metric code is a matrix code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ used in the context of the rank metric. The Singleton
bound for the rank metric [14] (see also [8]) states that every rank-metric code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ with minimum
rank distance $d_R(\mathcal{C}) = d$ must satisfy

$$|\mathcal{C}| \le q^{\max\{n, m\}(\min\{n, m\} - d + 1)}. \qquad (4)$$

Codes that achieve this bound are called maximum-rank-distance (MRD) codes and they are known to
exist for all choices of parameters q, n, m and $d \le \min\{n, m\}$ [14].
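
As a small sanity check of the Singleton bound (4), the sketch below computes the minimum rank distance of a four-element code of 2 x 2 binary matrices and compares its size with the bound. The code used here is an illustrative assumption: it consists of the zero matrix, the identity, the companion matrix M of x^2 + x + 1, and M + I, which together form a matrix representation of the field with four elements. All function names are hypothetical.

```python
from itertools import combinations

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    rows, rank = [list(r) for r in M], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def rank_distance(X, Y):
    """d_R(X, Y) = rank(Y - X); over GF(2) subtraction is XOR."""
    return gf2_rank([[a ^ b for a, b in zip(rx, ry)] for rx, ry in zip(X, Y)])

def min_rank_distance(code):
    return min(rank_distance(X, Y) for X, Y in combinations(code, 2))

def singleton_bound(q, n, m, d):
    """Right-hand side of (4)."""
    return q ** (max(n, m) * (min(n, m) - d + 1))

ZERO = [[0, 0], [0, 0]]
ONE = [[1, 0], [0, 1]]
M = [[0, 1], [1, 1]]            # companion matrix of x^2 + x + 1
M_PLUS_ONE = [[1, 1], [1, 0]]
code = [ZERO, ONE, M, M_PLUS_ONE]

d = min_rank_distance(code)
bound = singleton_bound(2, 2, 2, d)
print(d, len(code), bound)      # prints "2 4 4"
```

Every nonzero difference of two codewords is invertible, so d = 2 and the code meets the bound with equality; it is therefore an MRD code for q = 2, n = m = 2.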

III. A General Approach to Adversarial Error Correction

This section presents a general approach to error correction over adversarial channels. This approach
is specialized to coherent and noncoherent network coding in Sections IV and V, respectively.

A. Adversarial Channels 

An adversarial channel is specified by a finite input alphabet $\mathcal{X}$, a finite output alphabet $\mathcal{Y}$ and a
collection of fan-out sets $\mathcal{Y}_x \subseteq \mathcal{Y}$ for all $x \in \mathcal{X}$. For each input x, the output y is constrained to be in $\mathcal{Y}_x$
but is otherwise arbitrarily chosen by an adversary. The constraint on the output is important: otherwise,
the adversary could prevent communication simply by mapping all inputs to the same output. No further
restrictions are imposed on the adversary; in particular, the adversary is potentially omniscient and has
unlimited computational power.

A code for an adversarial channel is a subset $\mathcal{C} \subseteq \mathcal{X}$. We say that a code is unambiguous for a channel
if the input codeword can always be uniquely determined from the channel output. More precisely, a
code $\mathcal{C}$ is unambiguous if the sets $\mathcal{Y}_x$, $x \in \mathcal{C}$, are pairwise disjoint. The importance of this concept lies in
the fact that, if the code is not unambiguous, then there exist codewords x, x' that are indistinguishable
at the decoder: if $\mathcal{Y}_x \cap \mathcal{Y}_{x'} \neq \emptyset$, then the adversary can (and will) exploit this ambiguity by mapping
both x and x' to the same output.

A decoder for a code $\mathcal{C}$ is any function $\hat{x}\colon \mathcal{Y} \to \mathcal{C} \cup \{f\}$, where $f \notin \mathcal{C}$ denotes a decoding failure
(detected error). When $x \in \mathcal{C}$ is transmitted and $y \in \mathcal{Y}_x$ is received, a decoder is said to be successful
if $\hat{x}(y) = x$. We say that a decoder is infallible if it is successful for all $y \in \mathcal{Y}_x$ and all $x \in \mathcal{C}$. Note
that the existence of an infallible decoder for $\mathcal{C}$ implies that $\mathcal{C}$ is unambiguous. Conversely, given any
unambiguous code $\mathcal{C}$, one can always find (by definition) a decoder that is infallible. One example is the
exhaustive decoder

$$\hat{x}(y) = \begin{cases} x & \text{if } y \in \mathcal{Y}_x \text{ and } y \notin \mathcal{Y}_{x'} \text{ for all } x' \in \mathcal{C},\ x' \neq x \\ f & \text{otherwise.} \end{cases}$$

In other words, an exhaustive decoder returns x if x is the unique codeword that could possibly have
been transmitted when y is received, and returns a failure otherwise.

Ideally, one would like to find a large (or largest) code that is unambiguous for a given adversarial
channel, together with a decoder that is infallible (and computationally efficient to implement).

¹There is no loss of generality in considering a single channel use, since the channel may be taken to correspond to multiple
uses of a simpler channel.
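
The definitions of this subsection translate almost literally into code. The sketch below is one possible rendering in Python, under the assumption of a hypothetical toy channel on the integers modulo 6 in which the adversary may either leave the input unchanged or add one to it. The function names and the use of `None` to represent the failure symbol f are choices made for illustration only.

```python
def fan_out(x, size=6):
    """Hypothetical toy channel: output is x or x + 1 (mod size)."""
    return {x % size, (x + 1) % size}

def is_unambiguous(code, fan_out_fn):
    """A code is unambiguous if its fan-out sets are pairwise disjoint."""
    code = list(code)
    return all(fan_out_fn(a).isdisjoint(fan_out_fn(b))
               for i, a in enumerate(code) for b in code[i + 1:])

def exhaustive_decoder(y, code, fan_out_fn):
    """Return the unique codeword that could have produced y, else None (failure f)."""
    candidates = [x for x in code if y in fan_out_fn(x)]
    return candidates[0] if len(candidates) == 1 else None

good_code = [0, 2, 4]   # fan-out sets {0,1}, {2,3}, {4,5} are disjoint
bad_code = [0, 1]       # fan-out sets {0,1} and {1,2} share the output 1

print(is_unambiguous(good_code, fan_out))        # True
print(is_unambiguous(bad_code, fan_out))         # False
print(exhaustive_decoder(3, good_code, fan_out)) # 2
print(exhaustive_decoder(1, bad_code, fan_out))  # None
```

For the unambiguous code the exhaustive decoder never fails on a valid channel output, which is the infallibility property described above; for the ambiguous code the adversary can force the output 1, and no decoder can tell the two codewords apart.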

B. Discrepancy

It is useful to consider adversarial channels parameterized by an adversarial effort $t \in \mathbb{N}$. Assume that
the fan-out sets are of the form

$$\mathcal{Y}_x = \{y \in \mathcal{Y} : \Delta(x, y) \le t\} \qquad (5)$$

for some $\Delta\colon \mathcal{X} \times \mathcal{Y} \to \mathbb{N}$. The value $\Delta(x, y)$, which we call the discrepancy between x and y, represents
the minimum effort needed for an adversary to transform an input x into an output y. The value of t
represents the maximum adversarial effort (maximum discrepancy) allowed in the channel.

In principle, there is no loss of generality in assuming (5) since, by properly defining $\Delta(x, y)$, one can
always express any $\mathcal{Y}_x$ in this form. For instance, one could set $\Delta(x, y) = 0$ if $y \in \mathcal{Y}_x$, and $\Delta(x, y) = \infty$
otherwise. However, such a definition would be of no practical value since $\Delta(x, y)$ would be merely an
indicator function. Thus, an effective limitation of our model is that it requires channels that are naturally
characterized by some discrepancy function. In particular, one should be able to interpret the maximum
discrepancy t as the level of "degradedness" of the channel.

On the other hand, the assumption $\Delta(x, y) \in \mathbb{N}$ imposes effectively no constraint. Since $|\mathcal{X} \times \mathcal{Y}|$ is
finite, given any "naturally defined" $\Delta'\colon \mathcal{X} \times \mathcal{Y} \to \mathbb{R}$, one can always shift, scale and round the image
of $\Delta'$ in order to produce some $\Delta\colon \mathcal{X} \times \mathcal{Y} \to \mathbb{N}$ that induces the same fan-out sets as $\Delta'$ for all t.

Example 1: Let us use the above notation to define a t-error channel, i.e., a vector channel that
introduces at most t symbol errors (arbitrarily chosen by an adversary). Assume that the channel input
and output alphabets are given by $\mathcal{X} = \mathcal{Y} = \mathbb{F}_q^n$. It is easy to see that the channel can be characterized
by a discrepancy function that counts the number of components in which an input vector x and an
output vector y differ. More precisely, we have $\Delta(x, y) = d_H(x, y)$, where $d_H(\cdot, \cdot)$ denotes the Hamming
distance function. ■

A main feature of our proposed discrepancy characterization is that it allows us to study a whole family
of channels (with various levels of degradedness) under the same framework. For instance, we can use
a single decoder for all channels in the same family. Define the minimum-discrepancy decoder given by

$$\hat{x} = \operatorname*{argmin}_{x \in \mathcal{C}} \Delta(x, y) \qquad (6)$$

where any ties in (6) are assumed to be broken arbitrarily. It is easy to see that a minimum-discrepancy
decoder is infallible provided that the code is unambiguous. Thus, we can safely restrict attention to a
minimum-discrepancy decoder, regardless of the maximum discrepancy t in the channel.
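
A minimum-discrepancy decoder as in (6) can be sketched in a few lines when the code is small enough to enumerate. The example below assumes the Hamming discrepancy of Example 1 and an illustrative three-word binary code of length 5; both the code and the function names are hypothetical. Ties are broken by taking the first minimizer, which is one arbitrary choice among those the text permits.

```python
def hamming(x, y):
    """Hamming discrepancy of Example 1: number of differing positions."""
    return sum(a != b for a, b in zip(x, y))

def min_discrepancy_decode(y, code, discrepancy):
    """Decoder (6): a codeword of smallest discrepancy to y (first one on ties)."""
    return min(code, key=lambda x: discrepancy(x, y))

code = ["00000", "11100", "00111"]   # illustrative code, minimum distance 3
received = "10100"                   # "11100" with its second symbol flipped

decoded = min_discrepancy_decode(received, code, hamming)
print(decoded)                       # prints "11100"
```

The decoder itself does not take the channel parameter t as an input, which reflects the remark that a single decoder serves the whole family of channels.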

C. Correction Capability

Given a fixed family of channels, specified by $\mathcal{X}$, $\mathcal{Y}$ and $\Delta(\cdot, \cdot)$ and parameterized by a maximum
discrepancy t, we wish to identify the largest (worst) channel parameter for which we can guarantee
successful decoding. We say that a code is t-discrepancy-correcting if it is unambiguous for a channel
with maximum discrepancy t. The discrepancy-correction capability of a code $\mathcal{C}$ is the largest t for which
$\mathcal{C}$ is t-discrepancy-correcting.

We start by giving a general characterization of the discrepancy-correction capability. Let the function
$\tau\colon \mathcal{X} \times \mathcal{X} \to \mathbb{N}$ be given by

$$\tau(x, x') = \min_{y \in \mathcal{Y}} \max\{\Delta(x, y), \Delta(x', y)\} - 1. \qquad (7)$$

We have the following result.

Proposition 1: The discrepancy-correction capability of a code $\mathcal{C}$ is given exactly by $\tau(\mathcal{C})$. In other
words, $\mathcal{C}$ is t-discrepancy-correcting if and only if $t \le \tau(\mathcal{C})$.

Proof: Suppose that the code is not t-discrepancy-correcting, i.e., that there exist some distinct
$x, x' \in \mathcal{C}$ and some $y \in \mathcal{Y}$ such that $\Delta(x, y) \le t$ and $\Delta(x', y) \le t$. Then $\tau(\mathcal{C}) \le \tau(x, x') \le
\max\{\Delta(x, y), \Delta(x', y)\} - 1 \le t - 1 < t$. In other words, $\tau(\mathcal{C}) \ge t$ implies that the code is t-discrepancy-correcting.

Conversely, suppose that $\tau(\mathcal{C}) < t$, i.e., $\tau(\mathcal{C}) \le t - 1$. Then there exist some distinct $x, x' \in \mathcal{C}$ such that
$\tau(x, x') \le t - 1$. This in turn implies that there exists some $y \in \mathcal{Y}$ such that $\max\{\Delta(x, y), \Delta(x', y)\} \le t$.
Since this implies that both $\Delta(x, y) \le t$ and $\Delta(x', y) \le t$, it follows that the code is not t-discrepancy-correcting. ■
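
Proposition 1 can be checked by brute force on a small instance. The sketch below assumes the Hamming discrepancy on binary vectors of length 5 and an illustrative three-word code; it computes the quantity in (7), minimized over pairs of distinct codewords, and compares the result against a direct test of unambiguity for every t. All names are hypothetical.

```python
from itertools import combinations, product

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def tau(x, xp, outputs, disc):
    """Equation (7): min over y of max{disc(x, y), disc(x', y)}, minus 1."""
    return min(max(disc(x, y), disc(xp, y)) for y in outputs) - 1

def tau_code(code, outputs, disc):
    """Minimum of tau over all pairs of distinct codewords."""
    return min(tau(x, xp, outputs, disc) for x, xp in combinations(code, 2))

def is_t_correcting(code, outputs, disc, t):
    """Unambiguous at level t: no output is within discrepancy t of two codewords."""
    return all(sum(disc(x, y) <= t for x in code) <= 1 for y in outputs)

outputs = list(product((0, 1), repeat=5))
code = [(0, 0, 0, 0, 0), (1, 1, 1, 0, 0), (0, 0, 1, 1, 1)]

capability = tau_code(code, outputs, hamming)
agreement = all(is_t_correcting(code, outputs, hamming, t) == (t <= capability)
                for t in range(6))
print(capability, agreement)   # prints "1 True"
```

The code has minimum Hamming distance 3, so a capability of 1 agrees with the classical result.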

At this point, it is tempting to define a "distance-like" function given by $2(\tau(x, x') + 1)$, since this
would enable us to immediately obtain results analogous to those of classical coding theory (such as
the error correction capability being half the minimum distance of the code). This approach has indeed
been taken in previous works, such as [12]. Note, however, that the terminology "distance" suggests a
geometrical interpretation, which is not immediately clear from (7). Moreover, the function (7) is not
necessarily mathematically tractable. It is the objective of this section to propose a "distance" function
$\delta\colon \mathcal{X} \times \mathcal{X} \to \mathbb{N}$ that is motivated by geometrical considerations and is easier to handle analytically, yet is
useful to characterize the correction capability of a code. In particular, we shall be able to obtain the same
results as [12] with much greater mathematical simplicity, which will later turn out to be instrumental
for code design.

For $x, x' \in \mathcal{X}$, define the $\Delta$-distance between x and x' as

$$\delta(x, x') \triangleq \min_{y \in \mathcal{Y}} \{\Delta(x, y) + \Delta(x', y)\}. \qquad (8)$$

The following interpretation holds. Consider the complete bipartite graph with vertex sets $\mathcal{X}$ and $\mathcal{Y}$, and
assume that each edge $(x, y) \in \mathcal{X} \times \mathcal{Y}$ is labeled by a "length" $\Delta(x, y)$. Then $\delta(x, x')$ is the length of
the shortest path between vertices $x, x' \in \mathcal{X}$. Roughly speaking, $\delta(x, x')$ gives the minimum total effort
that an adversary would have to spend (in independent channel realizations) in order to make x and x'
both plausible explanations for some received output.

Example 2: Let us compute the $\Delta$-distance for the channel of Example 1. We have $\delta(x, x') =
\min_y \{d_H(x, y) + d_H(x', y)\} \ge d_H(x, x')$, since the Hamming distance satisfies the triangle inequality.
This bound is achievable by taking, for instance, y = x'. Thus, $\delta(x, x') = d_H(x, x')$, i.e., the $\Delta$-distance
for this channel is given precisely by the Hamming distance. ■

The following result justifies our definition of the $\Delta$-distance.

Proposition 2: For any code $\mathcal{C}$, $\tau(\mathcal{C}) \ge \lfloor (\delta(\mathcal{C}) - 1)/2 \rfloor$.

Proof: This follows from the fact that $\lfloor (a + b + 1)/2 \rfloor \le \max\{a, b\}$ for all $a, b \in \mathbb{Z}$. ■
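
For completeness, the step from the integer inequality to the proposition can be spelled out as follows. This is only an expanded reading of the one-line proof, writing $a = \Delta(x, y)$ and $b = \Delta(x', y)$ for a fixed pair of distinct codewords x, x' and a fixed output y.

```latex
\begin{align*}
\left\lfloor \frac{\delta(x, x') - 1}{2} \right\rfloor
  &\le \left\lfloor \frac{a + b - 1}{2} \right\rfloor
     && \text{since } \delta(x, x') \le a + b \text{ by (8)} \\
  &= \left\lfloor \frac{a + b + 1}{2} \right\rfloor - 1 \\
  &\le \max\{a, b\} - 1
     && \text{since } a + b + 1 \le 2\max\{a, b\} + 1 .
\end{align*}
```

Minimizing the right-hand side over y gives $\lfloor (\delta(x, x') - 1)/2 \rfloor \le \tau(x, x')$ by (7), and minimizing both sides over all pairs of distinct codewords gives the statement for the code.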

Proposition 2 shows that $\delta(\mathcal{C})$ gives a lower bound on the correction capability of a code, therefore
providing a correction guarantee. The converse result, however, is not necessarily true in general. Thus,
up to this point, the proposed function is only partially useful: it is conceivable that the $\Delta$-distance
might be too conservative and give a guaranteed correction capability that is lower than the actual one.
Nevertheless, it is easier to deal with addition, as in (8), rather than maximization, as in (7).

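A special case where the converse is true is for a family of channels whose discrepancy function
satisfies the following condition:

Definition 1: A discrepancy function $\Delta\colon \mathcal{X} \times \mathcal{Y} \to \mathbb{N}$ is said to be normal if, for all $x, x' \in \mathcal{X}$ and
all $0 \le i \le \delta(x, x')$, there exists some $y \in \mathcal{Y}$ such that $\Delta(x, y) = i$ and $\Delta(x', y) = \delta(x, x') - i$.

Theorem 3: Suppose that $\Delta(\cdot, \cdot)$ is normal. For every code $\mathcal{C} \subseteq \mathcal{X}$, we have $\tau(\mathcal{C}) = \lfloor (\delta(\mathcal{C}) - 1)/2 \rfloor$.

Proof: We just need to show that $\lfloor (\delta(\mathcal{C}) - 1)/2 \rfloor \ge \tau(\mathcal{C})$. Take any $x, x' \in \mathcal{X}$. Since $\Delta(\cdot, \cdot)$ is
normal, there exists some $y \in \mathcal{Y}$ such that $\Delta(x, y) + \Delta(x', y) = \delta(x, x')$ and either $\Delta(x, y) = \Delta(x', y)$ or
$\Delta(x, y) = \Delta(x', y) - 1$ (take $i = \lfloor \delta(x, x')/2 \rfloor$ in Definition 1). Thus, $\delta(x, x') \ge 2 \max\{\Delta(x, y), \Delta(x', y)\} - 1$
and therefore $\lfloor (\delta(x, x') - 1)/2 \rfloor \ge \tau(x, x')$. ■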
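
The role of normality can be illustrated numerically. The sketch below tests Definition 1 by exhaustive search, first for the Hamming discrepancy on binary vectors of length 3, and then for a made-up two-input, one-output discrepancy table that is not normal. The table and all function names are hypothetical; the second case is included only to show that, without normality, the bound of Proposition 2 can be strict.

```python
from itertools import product

def delta(x, xp, outputs, disc):
    """Equation (8): min over y of disc(x, y) + disc(x', y)."""
    return min(disc(x, y) + disc(xp, y) for y in outputs)

def tau(x, xp, outputs, disc):
    """Equation (7)."""
    return min(max(disc(x, y), disc(xp, y)) for y in outputs) - 1

def is_normal(inputs, outputs, disc):
    """Exhaustive check of Definition 1."""
    for x, xp in product(inputs, repeat=2):
        d = delta(x, xp, outputs, disc)
        for i in range(d + 1):
            if not any(disc(x, y) == i and disc(xp, y) == d - i for y in outputs):
                return False
    return True

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

vectors = list(product((0, 1), repeat=3))
hamming_normal = is_normal(vectors, vectors, hamming)
hamming_tight = all(tau(x, xp, vectors, hamming) == (delta(x, xp, vectors, hamming) - 1) // 2
                    for x in vectors for xp in vectors if x != xp)

# Made-up discrepancy with a single output "y": not normal.
table = {("a", "y"): 1, ("b", "y"): 3}
toy = lambda x, y: table[(x, y)]
toy_normal = is_normal(["a", "b"], ["y"], toy)
toy_tau = tau("a", "b", ["y"], toy)
toy_bound = (delta("a", "b", ["y"], toy) - 1) // 2

print(hamming_normal, hamming_tight)   # True True
print(toy_normal, toy_tau, toy_bound)  # False 2 1
```

In the toy case the true capability (2) exceeds the guarantee obtained from the distance (1), which is exactly the kind of conservatism that normality rules out.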

Theorem 3 shows that, for certain families of channels, our proposed $\Delta$-distance achieves the goal
of this section: it is a (seemingly) tractable function that precisely describes the correction capability of
a code. In particular, the basic result of classical coding theory, that the Hamming distance precisely
describes the error correction capability of a code, follows from the fact that the Hamming distance (as
a discrepancy function) is normal. As we shall see, much of our effort in the next sections reduces to
showing that a specified discrepancy function is normal.

Note that, for normal discrepancy functions, we actually have $\tau(x, x') = \lfloor (\delta(x, x') - 1)/2 \rfloor$, so
Theorem 3 may also be regarded as providing an alternative (and more tractable) expression for $\tau(x, x')$.

Example 3: To give a nontrivial example, let us consider a binary vector channel that introduces at
most $\rho$ erasures (arbitrarily chosen by an adversary). The input alphabet is given by $\mathcal{X} = \{0, 1\}^n$,
while the output alphabet is given by $\mathcal{Y} = \{0, 1, \epsilon\}^n$, where $\epsilon$ denotes an erasure. We may define
$\Delta(x, y) = \sum_{i=1}^{n} \Delta(x_i, y_i)$, where

$$\Delta(x_i, y_i) = \begin{cases} 0 & \text{if } y_i = x_i \\ 1 & \text{if } y_i = \epsilon \\ \infty & \text{otherwise.} \end{cases}$$

The fan-out sets are then given by $\mathcal{Y}_x = \{y \in \mathcal{Y} : \Delta(x, y) \le \rho\}$. In order to compute $\delta(x, x')$, observe
the minimization in (8). It is easy to see that we should choose $y_i = x_i$ when $x_i = x_i'$, and $y_i = \epsilon$ when
$x_i \neq x_i'$. It follows that $\delta(x, x') = 2 d_H(x, x')$. Note that $\Delta(x, y)$ is normal. It follows from Theorem 3
that a code $\mathcal{C}$ can correct all the $\rho$ erasures introduced by the channel if and only if $2 d_H(\mathcal{C}) > 2\rho$. This
result precisely matches the well-known result of classical coding theory. ■
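
The distance computation in Example 3 can be confirmed by exhaustive search for a small block length. The sketch below assumes n = 3, represents the erasure symbol by the character "e", and uses the length-3 repetition code as an illustrative code; these choices and the function names are not part of the original text.

```python
import math
from itertools import product

ERASURE = "e"

def symbol_discrepancy(xi, yi):
    """Per-symbol discrepancy of Example 3."""
    if yi == xi:
        return 0
    if yi == ERASURE:
        return 1
    return math.inf

def erasure_discrepancy(x, y):
    return sum(symbol_discrepancy(a, b) for a, b in zip(x, y))

def hamming(x, y):
    return sum(a != b for a, b in zip(x, y))

def delta(x, xp, outputs):
    """Equation (8) for the erasure discrepancy."""
    return min(erasure_discrepancy(x, y) + erasure_discrepancy(xp, y) for y in outputs)

def is_unambiguous(code, outputs, rho):
    """No output is reachable from two codewords with at most rho erasures."""
    return all(sum(erasure_discrepancy(x, y) <= rho for x in code) <= 1 for y in outputs)

n = 3
inputs = list(product("01", repeat=n))
outputs = list(product("01" + ERASURE, repeat=n))

distance_matches = all(delta(x, xp, outputs) == 2 * hamming(x, xp)
                       for x in inputs for xp in inputs)
repetition = [("0", "0", "0"), ("1", "1", "1")]   # d_H = 3
print(distance_matches)                            # True
print(is_unambiguous(repetition, outputs, 2))      # True:  2 * 3 > 2 * 2
print(is_unambiguous(repetition, outputs, 3))      # False: 2 * 3 > 2 * 3 fails
```

The repetition code tolerates two erasures but not three, in agreement with the condition $2 d_H(\mathcal{C}) > 2\rho$.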
It is worth clarifying that, while we call $\delta(\cdot, \cdot)$ a "distance," this function may not necessarily be a
metric. While symmetry and non-negativity follow from the definition, a $\Delta$-distance may not always
satisfy "$\delta(x, y) = 0 \iff x = y$" or the triangle inequality. Nevertheless, we keep the terminology for
convenience.

Although this is not our main interest in this paper, it is worth pointing out that the framework of
this section is also useful for obtaining results on error detection. Namely, the $\Delta$-distance gives, in
general, a lower bound on the discrepancy detection capability of a code under a bounded discrepancy-correcting
decoder; when the discrepancy function is normal, the $\Delta$-distance precisely characterizes
this detection capability (as in classical coding theory). For more details on this topic, see
Appendix A.

IV. Coherent Network Coding

A. A Worst-Case Model and the Rank Metric

The basic channel model for coherent network coding with adversarial errors is a matrix channel with
input $X \in \mathbb{F}_q^{n \times m}$, output $Y \in \mathbb{F}_q^{N \times m}$, and channel law given by (1), where $A \in \mathbb{F}_q^{N \times n}$ is fixed and
known to the receiver, and $Z \in \mathbb{F}_q^{t \times m}$ is arbitrarily chosen by an adversary. Here, we make the following
additional assumptions:

• The adversary has unlimited computational power and is omniscient; in particular, the adversary
knows both A and X;

• The matrix $D \in \mathbb{F}_q^{N \times t}$ is arbitrarily chosen by the adversary.

We also assume that t < n (more precisely, we should assume t < rank A); otherwise, the adversary
may always choose DZ = -AX, leading to a trivial communications scenario.

The first assumption above allows us to use the approach of Section III. The second assumption may
seem somewhat "pessimistic," but it has the analytical advantage of eliminating from the problem any
further dependence on the network code. (Recall that, in principle, D would be determined by the network
code and the choice of links in error.)

The power of the approach of Section III lies in the fact that the channel model defined above can be
completely described by the following discrepancy function

$$\Delta_A(X, Y) \triangleq \min_{\substack{r \in \mathbb{N},\ D \in \mathbb{F}_q^{N \times r},\ Z \in \mathbb{F}_q^{r \times m}: \\ Y = AX + DZ}} r. \qquad (9)$$

The discrepancy $\Delta_A(X, Y)$ represents the minimum number of error packets that the adversary needs
to inject in order to transform an input X into an output Y, given that the transfer matrix is A. The
subscript in $\Delta_A(X, Y)$ is to emphasize the dependence on A. For this discrepancy function, the minimum-discrepancy
decoder becomes

$$\hat{X} = \operatorname*{argmin}_{X \in \mathcal{C}} \Delta_A(X, Y). \qquad (10)$$

Similarly, the $\Delta$-distance induced by $\Delta_A(X, Y)$ is given by

$$\delta_A(X, X') \triangleq \min_{Y \in \mathbb{F}_q^{N \times m}} \{\Delta_A(X, Y) + \Delta_A(X', Y)\} \qquad (11)$$

for $X, X' \in \mathbb{F}_q^{n \times m}$.

We now wish to find a simpler expression for $\Delta_A(X, Y)$ and $\delta_A(X, X')$, and show that $\Delta_A(X, Y)$ is
normal.

Lemma 4:

$$\Delta_A(X, Y) = \operatorname{rank}(Y - AX). \qquad (12)$$

Proof: Consider $\Delta_A(X, Y)$ as given by (9). For any feasible triple (r, D, Z), we have $r \ge \operatorname{rank} Z \ge
\operatorname{rank} DZ = \operatorname{rank}(Y - AX)$. This bound is achievable by setting $r = \operatorname{rank}(Y - AX)$ and letting DZ be
a full-rank decomposition of Y - AX. ■

Lemma 5:

$$\delta_A(X, X') = d_R(AX, AX') = \operatorname{rank} A(X' - X).$$

Proof: From (11) and Lemma 4, we have $\delta_A(X, X') = \min_Y \{d_R(Y, AX) + d_R(Y, AX')\}$. Since
the rank metric satisfies the triangle inequality, we have $d_R(AX, Y) + d_R(AX', Y) \ge d_R(AX, AX')$.
This lower bound can be achieved by choosing, e.g., Y = AX. ■

Note that $\delta_A(\cdot, \cdot)$ is a metric if and only if A has full column rank, in which case it is precisely the
rank metric. (If rank A < n, then there exist $X \neq X'$ such that $\delta_A(X, X') = 0$.)

Theorem 6: The discrepancy function $\Delta_A(\cdot, \cdot)$ is normal.

Proof: Let $X, X' \in \mathbb{F}_q^{n \times m}$ and let $0 \le i \le d = \delta_A(X, X')$. Then $\operatorname{rank} A(X' - X) = d$. By
performing a full-rank decomposition of A(X' - X), we can always find two matrices W and W' such
that W + W' = A(X' - X), rank W = i and rank W' = d - i. Taking Y = AX + W = AX' - W', we
have that $\Delta_A(X, Y) = i$ and $\Delta_A(X', Y) = d - i$. ■
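
The construction in the proof of Theorem 6 can be carried out explicitly. The sketch below works over the binary field, obtains a full-rank decomposition from the reduced row echelon form (pivot columns of the matrix times the nonzero rows of its echelon form), and splits it into W and W' of ranks i and d - i. The matrices A, X and X' are made-up examples and the helper names are hypothetical; discrepancies are evaluated through Lemma 4.

```python
def gf2_rref(M):
    """Return (nonzero rows of the reduced row echelon form, pivot columns) over GF(2)."""
    R, pivots, row = [list(r) for r in M], [], 0
    for col in range(len(R[0])):
        p = next((i for i in range(row, len(R)) if R[i][col]), None)
        if p is None:
            continue
        R[row], R[p] = R[p], R[row]
        for i in range(len(R)):
            if i != row and R[i][col]:
                R[i] = [a ^ b for a, b in zip(R[i], R[row])]
        pivots.append(col)
        row += 1
    return R[:row], pivots

def gf2_rank(M):
    return len(gf2_rref(M)[1])

def gf2_matmul(A, B, ncols):
    """Product over GF(2); ncols is explicit so an empty inner dimension gives zeros."""
    return [[sum(A[i][k] & B[k][j] for k in range(len(B))) % 2
             for j in range(ncols)] for i in range(len(A))]

def gf2_add(A, B):
    return [[a ^ b for a, b in zip(ra, rb)] for ra, rb in zip(A, B)]

def split(M, i):
    """Write M = W + W' with rank W = i and rank W' = rank M - i."""
    Q, pivots = gf2_rref(M)                        # Q: d x m, full row rank
    P = [[r[c] for c in pivots] for r in M]        # P: N x d, full column rank, M = PQ
    m = len(M[0])
    W = gf2_matmul([r[:i] for r in P], Q[:i], m)
    W_prime = gf2_matmul([r[i:] for r in P], Q[i:], m)
    return W, W_prime

A = [[1, 0, 0], [0, 1, 0], [1, 1, 0]]
X = [[1, 1, 0], [0, 1, 1], [0, 0, 1]]
Xp = [[0, 1, 0], [0, 0, 1], [0, 0, 1]]

AX, AXp = gf2_matmul(A, X, 3), gf2_matmul(A, Xp, 3)
M = gf2_add(AXp, AX)                               # A(X' - X)
d = gf2_rank(M)
results = []
for i in range(d + 1):
    W, W_prime = split(M, i)
    Y = gf2_add(AX, W)                             # Y = AX + W = AX' - W'
    results.append((i, gf2_rank(gf2_add(Y, AX)), gf2_rank(gf2_add(Y, AXp))))
print(d, results)   # prints "2 [(0, 0, 2), (1, 1, 1), (2, 2, 0)]"
```

For each i the received matrix Y lies at discrepancy i from X and d - i from X', which is the normality property.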

Note that, under the discrepancy $\Delta_A(X, Y)$, a t-discrepancy-correcting code is a code that can correct
any t packet errors injected by the adversary. Using Theorem 6 and Theorem 3, we have the following
result.

Theorem 7: A code $\mathcal{C}$ is guaranteed to correct any t packet errors if and only if $\delta_A(\mathcal{C}) > 2t$.

Theorem 7 shows that $\delta_A(\mathcal{C})$ is indeed a fundamental parameter characterizing the error correction
capability of a code in our model. Note that, if the condition of Theorem 7 is violated, then there exists
at least one codeword for which the adversary can certainly induce a decoding failure.

Note that the error correction capability of a code $\mathcal{C}$ is dependent on the network code through the
matrix A. Let $\rho = n - \operatorname{rank} A$ be the column-rank deficiency of A. Since $\delta_A(X, X') = \operatorname{rank} A(X' - X)$,
it follows from (3) that

$$d_R(X, X') - \rho \le \delta_A(X, X') \le d_R(X, X')$$

and

$$d_R(\mathcal{C}) - \rho \le \delta_A(\mathcal{C}) \le d_R(\mathcal{C}). \qquad (13)$$

Thus, the error correction capability of a code is strongly tied to its minimum rank distance; in particular,
$\delta_A(\mathcal{C}) = d_R(\mathcal{C})$ if $\rho = 0$. While the lower bound $\delta_A(\mathcal{C}) \ge d_R(\mathcal{C}) - \rho$ may not be tight in general, we should
expect it to be tight when $\mathcal{C}$ is sufficiently large. This is indeed the case for MRD codes, as discussed
in Section IV-C. Thus, a rank deficiency of A will typically reduce the error correction capability of a
code.

Taking into account the worst case, we can use Theorem 7 to give a correction guarantee in terms of
the minimum rank distance of the code.

Proposition 8: A code $\mathcal{C}$ is guaranteed to correct t packet errors, under rank deficiency $\rho$, if $d_R(\mathcal{C}) >
2t + \rho$.

Note that the guarantee of Proposition 8 depends only on $\rho$ and t; in particular, it is independent of
the network code or the specific transfer matrix A.
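
The bounds in (13) and the guarantee of Proposition 8 can be evaluated on a tiny instance. The sketch below assumes a four-element code of 2 x 2 binary matrices with minimum rank distance 2 and a made-up rank-deficient transfer matrix A with rank deficiency 1. The helper `max_guaranteed_t` is a hypothetical name; it simply solves $d_R(\mathcal{C}) > 2t + \rho$ for the largest integer t and returns a negative value when no guarantee exists even for t = 0.

```python
from itertools import combinations

def gf2_rank(M):
    """Rank over GF(2) by Gaussian elimination."""
    rows, rank = [list(r) for r in M], 0
    for col in range(len(rows[0])):
        pivot = next((i for i in range(rank, len(rows)) if rows[i][col]), None)
        if pivot is None:
            continue
        rows[rank], rows[pivot] = rows[pivot], rows[rank]
        for i in range(len(rows)):
            if i != rank and rows[i][col]:
                rows[i] = [a ^ b for a, b in zip(rows[i], rows[rank])]
        rank += 1
    return rank

def gf2_matmul(A, B):
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

def min_distance(code, A):
    """min over distinct codewords of rank A(X' - X); A = identity gives d_R."""
    return min(gf2_rank(gf2_matmul(A, [[a ^ b for a, b in zip(rx, ry)]
                                       for rx, ry in zip(X, Xp)]))
               for X, Xp in combinations(code, 2))

def max_guaranteed_t(d_rank, rho):
    """Largest t with d_rank > 2t + rho (negative if there is none)."""
    return (d_rank - rho - 1) // 2

code = [[[0, 0], [0, 0]], [[1, 0], [0, 1]], [[0, 1], [1, 1]], [[1, 1], [1, 0]]]
identity = [[1, 0], [0, 1]]
A = [[1, 0], [0, 0]]               # rank 1, so rho = n - rank A = 1

d_rank = min_distance(code, identity)
delta_A = min_distance(code, A)
rho = 2 - gf2_rank(A)
print(d_rank, delta_A, rho)        # prints "2 1 1"
print(max_guaranteed_t(d_rank, rho), max_guaranteed_t(5, 1))   # prints "0 1"
```

Here the lower bound in (13) holds with equality, so the rank deficiency costs the code exactly one unit of distance.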

B. Reinterpreting the Model of Yeung et al. 

In this subsection, we investigate the model for coherent network coding studied by Yeung et al. in
[1]-[4], which is similar to the one considered in the previous subsection. The model is that of a matrix
channel with input $X \in \mathbb{F}_q^{n \times m}$, output $Y \in \mathbb{F}_q^{N \times m}$, and channel law given by

$$Y = AX + FE \qquad (14)$$

where $A \in \mathbb{F}_q^{N \times n}$ and $F \in \mathbb{F}_q^{N \times |\mathcal{E}|}$ are fixed and known to the receiver, and $E \in \mathbb{F}_q^{|\mathcal{E}| \times m}$ is arbitrarily
chosen by an adversary provided $\operatorname{wt}(E) \le t$. (Recall that $|\mathcal{E}|$ is the number of edges in the network.) In
addition, the adversary has unlimited computational power and is omniscient, knowing, in particular, A,
F and X.

We now show that some of the concepts defined in [4], such as "network Hamming distance," can be
reinterpreted in the framework of Section III. As a consequence, we can easily recover the results of [4]
on error correction and detection guarantees.

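First, note that the current model can be completely described by the following discrepancy function

$$\Delta_{A,F}(X, Y) \triangleq \min_{E:\ Y = AX + FE} \operatorname{wt}(E). \qquad (15)$$

The $\Delta$-distance induced by this discrepancy function is given by

$$\begin{aligned}
\delta_{A,F}(X_1, X_2) &\triangleq \min_{Y} \{\Delta_{A,F}(X_1, Y) + \Delta_{A,F}(X_2, Y)\} \\
&= \min_{\substack{Y, E_1, E_2: \\ Y = AX_1 + FE_1 \\ Y = AX_2 + FE_2}} \{\operatorname{wt}(E_1) + \operatorname{wt}(E_2)\} \\
&= \min_{\substack{E_1, E_2: \\ A(X_2 - X_1) = F(E_1 - E_2)}} \{\operatorname{wt}(E_1) + \operatorname{wt}(E_2)\} \\
&= \min_{E:\ A(X_2 - X_1) = FE} \operatorname{wt}(E) \qquad (16)
\end{aligned}$$

where the last equality follows from the fact that $\operatorname{wt}(E_1 - E_2) \le \operatorname{wt}(E_1) + \operatorname{wt}(E_2)$, achievable if $E_1 = 0$.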
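
For very small networks, the final expression for $\delta_{A,F}$ can be evaluated by exhaustive search over E. The sketch below does this over the binary field; the target matrix T stands for $A(X_2 - X_1)$, and the two matrices F are made-up examples (one in which every edge reaches a single received packet, and one with an extra edge that reaches both). The function names are hypothetical, and the search is exponential in the size of E, so it is meant only as an illustration.

```python
from itertools import product

def gf2_matmul(A, B):
    return [[sum(a & b for a, b in zip(row, col)) % 2 for col in zip(*B)]
            for row in A]

def wt(E):
    """Number of nonzero rows of E."""
    return sum(1 for row in E if any(row))

def network_distance(T, F, m):
    """min wt(E) over all E with F E = T over GF(2); None if no such E exists."""
    num_edges = len(F[0])
    best = None
    for bits in product((0, 1), repeat=num_edges * m):
        E = [list(bits[r * m:(r + 1) * m]) for r in range(num_edges)]
        if gf2_matmul(F, E) == T and (best is None or wt(E) < best):
            best = wt(E)
    return best

T = [[1, 1], [1, 1]]                  # plays the role of A(X2 - X1); rank 1
F_direct = [[1, 0], [0, 1]]           # two edges, each reaching one received packet
F_shared = [[1, 0, 1], [0, 1, 1]]     # a third edge reaches both received packets

d_direct = network_distance(T, F_direct, 2)
d_shared = network_distance(T, F_shared, 2)
print(d_direct, d_shared)             # prints "2 1"
```

With `F_direct` the adversary needs two corrupted edges even though T has rank one, whereas with `F_shared` a single edge suffices; the distance therefore depends on F and is never smaller than the rank distance rank A(X2 - X1) of Lemma 5.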

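Let us now examine some of the concepts defined in [4]. For a specific sink node, the decoder proposed
in [4, Eq. (2)] has the form

$$\hat{X} = \operatorname*{argmin}_{X \in \mathcal{C}} \Psi_{A,F}(X, Y).$$

The definition of the objective function $\Psi_{A,F}(X, Y)$ requires several other definitions presented in
[4]. Specifically, $\Psi_{A,F}(X, Y) \triangleq D^{\mathrm{rec}}(AX, Y)$, where $D^{\mathrm{rec}}(Y_1, Y_2) \triangleq W^{\mathrm{rec}}(Y_2 - Y_1)$, $W^{\mathrm{rec}}(Y) \triangleq
\min_{E \in \Upsilon(Y)} \operatorname{wt}(E)$, and $\Upsilon(Y) \triangleq \{E : Y = FE\}$. Substituting all these values into $\Psi_{A,F}(X, Y)$, we
obtain

$$\begin{aligned}
\Psi_{A,F}(X, Y) &= D^{\mathrm{rec}}(AX, Y) \\
&= W^{\mathrm{rec}}(Y - AX) \\
&= \min_{E \in \Upsilon(Y - AX)} \operatorname{wt}(E) \\
&= \min_{E:\ Y - AX = FE} \operatorname{wt}(E) \\
&= \Delta_{A,F}(X, Y).
\end{aligned}$$

Thus, the decoder in [4] is precisely a minimum-discrepancy decoder.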

In [4], the "network Hamming distance" between two messages $X_1$ and $X_2$ is defined as $D^{\mathrm{msg}}(X_1,X_2) = W^{\mathrm{msg}}(X_2 - X_1)$, where $W^{\mathrm{msg}}(X) = W^{\mathrm{rec}}(AX)$. Again, simply substituting the corresponding definitions yields

$$\begin{aligned}
D^{\mathrm{msg}}(X_1,X_2) &= W^{\mathrm{msg}}(X_2 - X_1) \\
&= W^{\mathrm{rec}}(A(X_2 - X_1)) \\
&= \min_{E \in \Upsilon(A(X_2 - X_1))} \operatorname{wt}(E) \\
&= \min_{E:\, A(X_2 - X_1) = FE} \operatorname{wt}(E) \\
&= \delta_{A,F}(X_1,X_2).
\end{aligned}$$

Thus, the "network Hamming distance" is precisely the $\Delta$-distance induced by the discrepancy function $\Delta_{A,F}(X,Y)$. Finally, the "unicast minimum distance" of a network code with message set $\mathcal{C}$ [4] is precisely $\delta_{A,F}(\mathcal{C})$.

Let us return to the problem of characterizing the correction capability of a code. 

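Proposition 9: The discrepancy function $\Delta_{A,F}(\cdot,\cdot)$ is normal.

Proof: Let $X_1, X_2 \in \mathbb{F}_q^{n \times m}$ and let $0 \le i \le d = \delta_{A,F}(X_1,X_2)$. Let $E \in \mathbb{F}_q^{|\mathcal{E}| \times m}$ be a solution to the minimization in (15). Then $A(X_2 - X_1) = FE$ and $\operatorname{wt}(E) = d$. By partitioning the nonzero rows of $E$, we can always find two matrices $E_1$ and $E_2$ such that $E_1 + E_2 = E$, $\operatorname{wt}(E_1) = i$ and $\operatorname{wt}(E_2) = d - i$. Taking $Y = AX_1 + FE_1 = AX_2 - FE_2$, we have that $\Delta_{A,F}(X_1,Y) \le i$ and $\Delta_{A,F}(X_2,Y) \le d - i$. Since $d \le \Delta_{A,F}(X_1,Y) + \Delta_{A,F}(X_2,Y)$, it follows that $\Delta_{A,F}(X_1,Y) = i$ and $\Delta_{A,F}(X_2,Y) = d - i$. ■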

It follows that a code $\mathcal{C}$ is guaranteed to correct any $t$ packet errors if and only if $\delta_{A,F}(\mathcal{C}) > 2t$. Thus, we recover Theorems 2 and 3 in [4] (for error detection, see Appendix A). The analogous results for the multicast case can be obtained in a straightforward manner.

We now wish to compare the parameters devised in this subsection with those of Section IV-A. From the descriptions of (1) and (14), it is intuitive that the model of this subsection should be equivalent to that of the previous subsection if the matrix $F$, rather than fixed and known to the receiver, is arbitrarily and secretly chosen by the adversary. A formal proof of this fact is given in the following proposition.

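Proposition 10:

$$\begin{aligned}
\Delta_A(X,Y) &= \min_{F \in \mathbb{F}_q^{N \times |\mathcal{E}|}} \Delta_{A,F}(X,Y) \\
\delta_A(X,X') &= \min_{F \in \mathbb{F}_q^{N \times |\mathcal{E}|}} \delta_{A,F}(X,X') \\
\delta_A(\mathcal{C}) &= \min_{F \in \mathbb{F}_q^{N \times |\mathcal{E}|}} \delta_{A,F}(\mathcal{C}).
\end{aligned}$$

Proof: Consider the minimization

$$\min_{F} \Delta_{A,F}(X,Y) = \min_{\substack{F,\, E: \\ Y = AX + FE}} \operatorname{wt}(E).$$

For any feasible $(F,E)$, we have $\operatorname{wt}(E) \ge \operatorname{rank} E \ge \operatorname{rank} FE = \operatorname{rank}(Y - AX)$. This lower bound can be achieved by taking

$$F = \begin{bmatrix} F' & 0 \end{bmatrix} \quad \text{and} \quad E = \begin{bmatrix} E' \\ 0 \end{bmatrix}$$

where $F'E'$ is a full-rank decomposition of $Y - AX$. This proves the first statement. The second statement follows from the first by noticing that $\delta_A(X,X') = \Delta_A(X,AX')$ and $\delta_{A,F}(X,X') = \Delta_{A,F}(X,AX')$. The third statement is immediate. ■

Proposition 10 shows that the model of Section IV-A is indeed more pessimistic, as the adversary has additional power to choose the worst possible $F$. It follows that any code that is $t$-error-correcting for that model must also be $t$-error-correcting for the model of Yeung et al.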



C. Optimality of MRD Codes 

Let us now evaluate the performance of an MRD code under the models of the two previous subsections. The Singleton bound of [2] (see also [9]) states that

$$|\mathcal{C}| \le Q^{\,n - \rho - \delta_{A,F}(\mathcal{C}) + 1} \tag{16}$$

where $Q$ is the size of the alphabet$^\dagger$ from which packets are drawn. Note that $Q = q^m$ in our setting, since each packet consists of $m$ symbols from $\mathbb{F}_q$. Using Proposition 10, we can also obtain

$$|\mathcal{C}| \le Q^{\,n - \rho - \delta_{A}(\mathcal{C}) + 1}. \tag{17}$$

On the other hand, the size of an MRD code, for $m \ge n$, is given by

\begin{align}
|\mathcal{C}| &= q^{m(n - d_R(\mathcal{C}) + 1)} \tag{18} \\
&\ge q^{m(n - \rho - \delta_A(\mathcal{C}) + 1)} \tag{19} \\
&\ge q^{m(n - \rho - \delta_{A,F}(\mathcal{C}) + 1)} \notag
\end{align}

where (19) follows from (13). Since $Q = q^m$, both (16) and (17) are achieved in this case. Thus, we have the following result.

Theorem 11: When $m \ge n$, an MRD code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ achieves maximum cardinality with respect to both $\delta_A$ and $\delta_{A,F}$.

Theorem 11 shows that, if an alphabet of size $Q = q^m \ge q^n$ is allowed (i.e., a packet size of at least $n \log_2 q$ bits), then MRD codes turn out to be optimal under both models of Sections IV-A and IV-B.

Remark: It is straightforward to extend the results of Section IV-A for the case of multiple heterogeneous receivers, where each receiver $u$ experiences a rank deficiency $\rho^{(u)}$. In this case, it can be shown that an MRD code with $m \ge n$ achieves the refined Singleton bound of [9].



$^\dagger$This alphabet is usually assumed to be a finite field, but, for the Singleton bound of [2], it is sufficient to assume an abelian group, e.g., a vector space over $\mathbb{F}_q$.

Note that, due to (17), (18) and (19), it follows that $\delta_A(\mathcal{C}) = d_R(\mathcal{C}) - \rho$ for an MRD code with $m \ge n$. Thus, in this case, we can restate Theorem 7 in terms of the minimum rank distance of the code.

Theorem 12: An MRD code $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$ with $m \ge n$ is guaranteed to correct $t$ packet errors, under rank deficiency $\rho$, if and only if $d_R(\mathcal{C}) > 2t + \rho$.
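As a small worked illustration of Theorem 12 and of the size formula (18), the sketch below computes the largest number of packet errors guaranteed correctable for a given minimum rank distance and rank deficiency. The function names and the sample parameters are illustrative assumptions, not part of the original text.

```python
def max_correctable_errors(d_rank, rho):
    """Largest t with d_rank > 2*t + rho (Theorem 12), or None if even t = 0 fails."""
    if d_rank <= rho:
        return None
    return (d_rank - rho - 1) // 2

def mrd_log_q_size(n, m, d_rank):
    """log_q |C| for an MRD code with m >= n, following (18)."""
    if m < n:
        raise ValueError("formula (18) is stated for m >= n")
    return m * (n - d_rank + 1)

# Hypothetical parameters: n = 8 packets of m = 16 symbols, d_R = 5, rho = 1.
print(max_correctable_errors(5, 1))   # 1, since 5 > 2*1 + 1 but 5 is not > 2*2 + 1
print(mrd_log_q_size(8, 16, 5))       # 64, i.e., |C| = q^64
```

The trade-off is visible directly: each unit of rank deficiency consumes one unit of minimum rank distance that would otherwise go toward error correction.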

Observe that Theorem 12 holds regardless of the specific transfer matrix $A$, depending only on its column-rank deficiency $\rho$.

The results of this section imply that, when designing a linear network code, we may focus solely on the objective of making the network code feasible, i.e., maximizing $\operatorname{rank} A$. If an error correction guarantee is desired, then an outer code can be applied end-to-end without requiring any modifications to (or even knowledge of) the underlying network code. The design of the outer code is essentially trivial, as any MRD code can be used, with the only requirement that the number of $\mathbb{F}_q$-symbols per packet, $m$, is at least $n$.

Remark: Consider the decoding rule (10). The fact that (10) together with (12) is equivalent to [8, Eq. (20)] implies that the decoding problem can be solved by exactly the same rank-metric techniques proposed in [8]. In particular, for certain MRD codes with $m \ge n$ and minimum rank distance $d$, there exist efficient encoding and decoding algorithms both requiring $O(dn^2 m)$ operations in $\mathbb{F}_q$ per codeword. For more details, see [15].

V. Noncoherent Network Coding 
A. A Worst-Case Model and the Injection Metric 

Our model for noncoherent network coding with adversarial errors differs from its coherent counterpart of Section IV-A only with respect to the transfer matrix $A$. Namely, the matrix $A$ is unknown to the receiver and is freely chosen by the adversary while respecting the constraint $\operatorname{rank} A \ge n - \rho$. The parameter $\rho$, the maximum column-rank deficiency of $A$, is a parameter of the system that is known to all. Note that, as discussed above for the matrix $D$, the assumption that $A$ is chosen by the adversary is what provides the conservative (worst-case) nature of the model. The constraint on the rank of $A$ is required for a meaningful coding problem; otherwise, the adversary could prevent communication by simply choosing $A = 0$.

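As before, we assume a minimum-discrepancy decoder

$$\hat{X} = \operatorname*{argmin}_{X \in \mathcal{C}} \Delta_\rho(X,Y) \tag{20}$$

with discrepancy function given by

\begin{align}
\Delta_\rho(X,Y) &= \min_{\substack{A,\, r,\, D,\, Z: \\ Y = AX + DZ \\ \operatorname{rank} A \ge n - \rho}} r \tag{21} \\
&= \min_{\substack{A \in \mathbb{F}_q^{N \times n}: \\ \operatorname{rank} A \ge n - \rho}} \Delta_A(X,Y). \notag
\end{align}

Again, $\Delta_\rho(X,Y)$ represents the minimum number of error packets needed to produce an output $Y$ given an input $X$ under the current adversarial model. The subscript is to emphasize that $\Delta_\rho(X,Y)$ is still a function of $\rho$.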

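The $\Delta$-distance induced by $\Delta_\rho(X,Y)$ is defined below. For $X, X' \in \mathbb{F}_q^{n \times m}$, let

$$\delta_\rho(X,X') \triangleq \min_{Y \in \mathbb{F}_q^{N \times m}} \left\{ \Delta_\rho(X,Y) + \Delta_\rho(X',Y) \right\}. \tag{22}$$

We now prove that $\Delta_\rho(X,Y)$ is normal and therefore $\delta_\rho(\mathcal{C})$ characterizes the correction capability of a code.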

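First, observe that, using Lemma 4, we may rewrite $\Delta_\rho(X,Y)$ as

$$\Delta_\rho(X,Y) = \min_{\substack{A \in \mathbb{F}_q^{N \times n}: \\ \operatorname{rank} A \ge n - \rho}} \operatorname{rank}(Y - AX). \tag{23}$$

Also, note that

\begin{align}
\delta_\rho(X,X') &= \min_{Y} \left\{ \Delta_\rho(X,Y) + \Delta_\rho(X',Y) \right\} \notag \\
&= \min_{\substack{A, A' \in \mathbb{F}_q^{N \times n}: \\ \operatorname{rank} A \ge n - \rho \\ \operatorname{rank} A' \ge n - \rho}} \; \min_{Y} \left\{ \operatorname{rank}(Y - AX) + \operatorname{rank}(Y - A'X') \right\} \notag \\
&= \min_{\substack{A, A' \in \mathbb{F}_q^{N \times n}: \\ \operatorname{rank} A \ge n - \rho \\ \operatorname{rank} A' \ge n - \rho}} \operatorname{rank}(A'X' - AX) \tag{24}
\end{align}

where the last equality follows from the fact that $d_R(AX,Y) + d_R(A'X',Y) \ge d_R(AX,A'X')$, achievable by choosing, e.g., $Y = AX$.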

Theorem 13: The discrepancy function $\Delta_\rho(\cdot,\cdot)$ is normal.

Proof: Let $X, X' \in \mathbb{F}_q^{n \times m}$ and let $0 \le i \le d = \delta_\rho(X,X')$. Let $A, A' \in \mathbb{F}_q^{N \times n}$ be a solution to the minimization in (24). Then $\operatorname{rank}(A'X' - AX) = d$. By performing a full-rank decomposition of $A'X' - AX$, we can always find two matrices $W$ and $W'$ such that $W + W' = A'X' - AX$, $\operatorname{rank} W = i$ and $\operatorname{rank} W' = d - i$. Taking $Y = AX + W = A'X' - W'$, we have that $\Delta_\rho(X,Y) \le i$ and $\Delta_\rho(X',Y) \le d - i$. Since $d \le \Delta_\rho(X,Y) + \Delta_\rho(X',Y)$, it follows that $\Delta_\rho(X,Y) = i$ and $\Delta_\rho(X',Y) = d - i$. ■

As a consequence of Theorem 13, we have the following result.

Theorem 14: A code $\mathcal{C}$ is guaranteed to correct any $t$ packet errors if and only if $\delta_\rho(\mathcal{C}) > 2t$.

Similarly as in Section IV-A, Theorem 14 shows that $\delta_\rho(\mathcal{C})$ is a fundamental parameter characterizing the error correction capability of a code in the current model. In contrast to Section IV-A, however, the expression for $\Delta_\rho(X,Y)$ (and, consequently, $\delta_\rho(X,X')$) does not seem mathematically appealing since it involves a minimization. We now proceed to finding simpler expressions for $\Delta_\rho(X,Y)$ and $\delta_\rho(X,X')$.

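The minimization in (23) is a special case of a more general expression, which we give as follows. For $X \in \mathbb{F}_q^{n \times m}$, $Y \in \mathbb{F}_q^{N \times m}$ and $L \ge \max\{n - \rho, N - \sigma\}$, let

$$\Delta_{\rho,\sigma,L}(X,Y) = \min_{\substack{A \in \mathbb{F}_q^{L \times n},\, B \in \mathbb{F}_q^{L \times N}: \\ \operatorname{rank} A \ge n - \rho \\ \operatorname{rank} B \ge N - \sigma}} \operatorname{rank}(BY - AX).$$

The quantity defined above is computed in the following lemma.

Lemma 15:

$$\Delta_{\rho,\sigma,L}(X,Y) = \left[ \max\{\operatorname{rank} X - \rho,\ \operatorname{rank} Y - \sigma\} - \dim(\langle X \rangle \cap \langle Y \rangle) \right]^+.$$

Proof: See Appendix B. ■

Note that $\Delta_{\rho,\sigma,L}(X,Y)$ is independent of $L$, for all valid $L$. Thus, we may drop the subscript and write simply $\Delta_{\rho,\sigma}(X,Y) \triangleq \Delta_{\rho,\sigma,L}(X,Y)$.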

We can now provide a simpler expression for $\Delta_\rho(X,Y)$.

Theorem 16:

$$\Delta_\rho(X,Y) = \max\{\operatorname{rank} X - \rho,\ \operatorname{rank} Y\} - \dim(\langle X \rangle \cap \langle Y \rangle).$$

Proof: This follows immediately from Lemma 15 by noticing that $\Delta_\rho(X,Y) = \Delta_{\rho,0}(X,Y)$. ■

From Theorem 16, we observe that $\Delta_\rho(X,Y)$ depends on the matrices $X$ and $Y$ only through their row spaces, i.e., only the transmitted and received row spaces have a role in the decoding. Put another way, we may say that the channel really accepts an input subspace $\langle X \rangle$ and delivers an output subspace $\langle Y \rangle$. Thus, all the communication is made via subspace selection. This observation provides a fundamental justification for the approach of [7].
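The closed form of Theorem 16 can be compared numerically against the minimization (23) on toy instances. The sketch below assumes $q = 2$ and stores each matrix row as an integer bitmask; all function names and the sample matrices are illustrative assumptions rather than anything defined in the paper.

```python
from itertools import product

def rank_gf2(rows):
    """Rank over GF(2); each row is an integer bitmask."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)  # clears the leading bit of b whenever it is set in r
        if r:
            basis.append(r)
    return len(basis)

def delta_rho_formula(X, Y, rho):
    """Closed form of Theorem 16 (X and Y are lists of row bitmasks)."""
    rx, ry = rank_gf2(X), rank_gf2(Y)
    dim_cap = rx + ry - rank_gf2(X + Y)  # dimension of the row-space intersection
    return max(rx - rho, ry) - dim_cap

def delta_rho_bruteforce(X, Y, rho):
    """Direct evaluation of (23): search every A with rank A >= n - rho."""
    n, N = len(X), len(Y)
    best = None
    for bits in product((0, 1), repeat=N * n):
        A = [bits[i * n:(i + 1) * n] for i in range(N)]
        A_rows = [sum(bit << k for k, bit in enumerate(row)) for row in A]
        if rank_gf2(A_rows) < n - rho:
            continue
        residual = []
        for row, y in zip(A, Y):
            ax = 0
            for k, bit in enumerate(row):
                if bit:
                    ax ^= X[k]        # row of AX as a GF(2) combination of rows of X
            residual.append(y ^ ax)   # row of Y - AX (subtraction equals XOR here)
        r = rank_gf2(residual)
        best = r if best is None else min(best, r)
    return best

# Hypothetical 2 x 3 binary matrices.
X = [0b110, 0b011]
Y = [0b110, 0b001]
for rho in (0, 1):
    print(rho, delta_rho_formula(X, Y, rho), delta_rho_bruteforce(X, Y, rho))
```

Because the formula only uses ranks of $X$, $Y$ and the stacked matrix, it makes explicit that the decoder needs nothing beyond the two row spaces.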

At this point, it is useful to introduce the following definition. 

Definition 2: The injection distance between subspaces $\mathcal{U}$ and $\mathcal{V}$ in $\mathcal{P}_q(m)$ is defined as

\begin{align}
d_I(\mathcal{U},\mathcal{V}) &= \max\{\dim \mathcal{U}, \dim \mathcal{V}\} - \dim(\mathcal{U} \cap \mathcal{V}) \tag{25} \\
&= \dim(\mathcal{U} + \mathcal{V}) - \min\{\dim \mathcal{U}, \dim \mathcal{V}\}. \notag
\end{align}

The injection distance can be interpreted as measuring the number of error packets that an adversary needs to inject in order to transform an input subspace $\langle X \rangle$ into an output subspace $\langle Y \rangle$. This can be clearly seen from the fact that $d_I(\langle X \rangle, \langle Y \rangle) = \Delta_0(X,Y)$. Thus, the injection distance is essentially equal to the discrepancy $\Delta_\rho(X,Y)$ when the channel is influenced only by the adversary, i.e., when the non-adversarial aspect of the channel (the column-rank deficiency of $A$) is removed from the problem. Note that, in this case, the decoder (20) becomes precisely a minimum-injection-distance decoder.

Proposition 17: The injection distance is a metric.

We delay the proof of Proposition 17 until Section V-B.

We can now use the definition of the injection distance to simplify the expression for the $\Delta$-distance.

Proposition 18:

$$\delta_\rho(X,X') = \left[ d_I(\langle X \rangle, \langle X' \rangle) - \rho \right]^+.$$

Proof: This follows immediately after realizing that $\delta_\rho(X,X') = \Delta_{\rho,\rho}(X,X')$. ■

From Proposition 18, it is clear that $\delta_\rho(\cdot,\cdot)$ is a metric if and only if $\rho = 0$ (in which case it is precisely the injection metric). If $\rho > 0$, then $\delta_\rho(\cdot,\cdot)$ does not satisfy the triangle inequality.

It is worth noticing that $\delta_\rho(X,X') = 0$ for any two matrices $X$ and $X'$ that share the same row space. Thus, any reasonable code $\mathcal{C}$ should avoid this situation.
For $\mathcal{C} \subseteq \mathbb{F}_q^{n \times m}$, let

$$\langle \mathcal{C} \rangle = \{ \langle X \rangle : X \in \mathcal{C} \}$$

be the subspace code (i.e., a collection of subspaces) consisting of the row spaces of all matrices in $\mathcal{C}$. The following corollary of Proposition 18 is immediate.

Corollary 19: Suppose $\mathcal{C}$ is such that $|\mathcal{C}| = |\langle \mathcal{C} \rangle|$, i.e., no two codewords of $\mathcal{C}$ have the same row space. Then

$$\delta_\rho(\mathcal{C}) = \left[ d_I(\langle \mathcal{C} \rangle) - \rho \right]^+.$$

Using Corollary 19, we can restate Theorem 14 more simply in terms of the injection distance.

Theorem 20: A code $\mathcal{C}$ is guaranteed to correct $t$ packet errors, under rank deficiency $\rho$, if and only if $d_I(\langle \mathcal{C} \rangle) > 2t + \rho$.

Note that, due to equality in Corollary 19, a converse is indeed possible in Theorem 20 (contrast with Proposition 8 for the coherent case).

Theorem 20 shows that $d_I(\langle \mathcal{C} \rangle)$ is a fundamental parameter characterizing the complete correction capability (i.e., error correction capability and "rank-deficiency correction" capability) of a code in our noncoherent model. Put another way, we may say that a code $\mathcal{C}$ is good for the model of this subsection if and only if its subspace version $\langle \mathcal{C} \rangle$ is a good code in the injection metric.

B. Comparison with the Metric of Kotter and Kschischang 

Let $\Omega \subseteq \mathcal{P}_q(m)$ be a subspace code whose elements have maximum dimension $n$. In [7], the network is modeled as an operator channel that takes in a subspace $\mathcal{V} \in \mathcal{P}_q(m)$ and puts out a possibly different subspace $\mathcal{U} \in \mathcal{P}_q(m)$. The kind of disturbance that the channel applies to $\mathcal{V}$ is captured by the notions of "insertions" and "deletions" of dimensions (represented mathematically using operators), and the degree of such a dissimilarity is captured by the subspace distance

\begin{align}
d_S(\mathcal{V},\mathcal{U}) &= \dim(\mathcal{V} + \mathcal{U}) - \dim(\mathcal{V} \cap \mathcal{U}) \notag \\
&= \dim \mathcal{V} + \dim \mathcal{U} - 2\dim(\mathcal{V} \cap \mathcal{U}) \tag{26} \\
&= 2\dim(\mathcal{V} + \mathcal{U}) - \dim \mathcal{V} - \dim \mathcal{U}. \notag
\end{align}

The transmitter selects some $\mathcal{V} \in \Omega$ and transmits $\mathcal{V}$ over the channel. The receiver receives some subspace $\mathcal{U}$ and, using a minimum subspace distance decoder, decides that the subspace $\hat{\mathcal{V}} \in \Omega$ was sent, where

$$\hat{\mathcal{V}} = \operatorname*{argmin}_{\mathcal{V} \in \Omega} d_S(\mathcal{V},\mathcal{U}). \tag{27}$$

This decoder is guaranteed to correct all disturbances applied by the channel if $d_S(\mathcal{V},\mathcal{U}) < d_S(\Omega)/2$, where $d_S(\Omega)$ is the minimum subspace distance between all pairs of distinct codewords of $\Omega$.

First, let us point out that this setup is indeed the same as that of Section V-A if we set $\mathcal{V} = \langle X \rangle$, $\mathcal{U} = \langle Y \rangle$ and $\Omega = \langle \mathcal{C} \rangle$, where $\mathcal{C}$ is such that $|\mathcal{C}| = |\langle \mathcal{C} \rangle|$. Also, any disturbance applied by an operator channel can be realized by a matrix model, and vice-versa. Thus, the difference between the approach of this section and that of [7] lies in the choice of the decoder.

Indeed, by using Theorem 16 and the definition of subspace distance, we get the following relationship:

Proposition 21:

$$\Delta_\rho(X,Y) = \tfrac{1}{2} d_S(\langle X \rangle, \langle Y \rangle) - \tfrac{1}{2}\rho + \tfrac{1}{2} \left| \operatorname{rank} X - \operatorname{rank} Y - \rho \right|.$$

Thus, we can see that when the matrices in $\mathcal{C}$ do not all have the same rank (i.e., $\Omega$ is a non-constant-dimension code), then the decoding rules (20) and (27) may produce different decisions.

Using $\rho = 0$ in the above proposition (or simply using (25) and (26)) gives us another formula for the injection distance:

$$d_I(\mathcal{V},\mathcal{U}) = \tfrac{1}{2} d_S(\mathcal{V},\mathcal{U}) + \tfrac{1}{2} \left| \dim \mathcal{V} - \dim \mathcal{U} \right|. \tag{28}$$

Fig. 1. Lattice of subspaces in Example 4. Two spaces are joined with a dashed line if one is a subspace of the other.

We can now prove a result that was postponed in the previous section.

Theorem 22: The injection distance is a metric.

Proof: Since $d_S(\cdot,\cdot)$ is a metric on $\mathcal{P}_q(m)$ and $|\cdot|$ is a norm on $\mathbb{R}$, it follows from (28) that $d_I(\cdot,\cdot)$ is also a metric on $\mathcal{P}_q(m)$. ■

We now examine in more detail an example situation where the minimum-subspace-distance decoder 
and the minimum-discrepancy decoder produce different decisions. 

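Example 4: For simplicity, assume $\rho = 0$. Consider a subspace code that contains two codewords $\mathcal{V}_1 = \langle X_1 \rangle$ and $\mathcal{V}_2 = \langle X_2 \rangle$ such that $\gamma = \dim \mathcal{V}_2 - \dim \mathcal{V}_1$ satisfies $d/3 < \gamma < d/2$, where $d = d_S(\mathcal{V}_1,\mathcal{V}_2)$. Suppose the received subspace $\mathcal{U} = \langle Y \rangle$ is such that $\mathcal{V}_1 \subset \mathcal{U} \subset \mathcal{V}_1 + \mathcal{V}_2$ and $\dim \mathcal{U} = \dim \mathcal{V}_1 + \gamma = \dim \mathcal{V}_2$, as illustrated in Fig. 1. Then $d_S(\mathcal{V}_1,\mathcal{U}) = \gamma$ and $d_S(\mathcal{V}_2,\mathcal{U}) = d - \gamma$, while Proposition 21 gives $\Delta_\rho(X_1,Y) = \gamma$ and $\Delta_\rho(X_2,Y) = (d - \gamma)/2 = \epsilon$. Since, by assumption, $d - \gamma > \gamma$ and $\epsilon < \gamma$, it follows that $d_S(\mathcal{V}_1,\mathcal{U}) < d_S(\mathcal{V}_2,\mathcal{U})$ but $\Delta_\rho(X_1,Y) > \Delta_\rho(X_2,Y)$, i.e., the decoders (27) and (20) will produce different decisions.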
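A concrete instance of this situation can be checked numerically. The sketch below is an illustration only: it assumes $q = 2$ and $m = 7$, picks subspaces spanned by unit vectors (so that $d = 7$ and $\gamma = 3$), and uses hypothetical helper names; none of these choices come from the original example.

```python
def rank_gf2(rows):
    """Rank over GF(2); each row is an integer bitmask."""
    basis = []
    for r in rows:
        for b in basis:
            r = min(r, r ^ b)
        if r:
            basis.append(r)
    return len(basis)

def subspace_distance(U, V):
    """d_S via the last line of (26); U and V are lists of spanning rows."""
    return 2 * rank_gf2(U + V) - rank_gf2(U) - rank_gf2(V)

def injection_distance(U, V):
    """d_I via the second line of (25)."""
    return rank_gf2(U + V) - min(rank_gf2(U), rank_gf2(V))

def decode(code, received, distance):
    """Return the name of the codeword closest to the received subspace."""
    return min(code, key=lambda name: distance(code[name], received))

e = [1 << k for k in range(7)]          # unit vectors of F_2^7 as bitmasks
code = {"V1": e[0:2], "V2": e[2:7]}     # dim 2 and dim 5, so gamma = 3 and d = 7
U = e[0:5]                              # V1 inside U inside V1 + V2, dim U = dim V2

print(subspace_distance(code["V1"], U), subspace_distance(code["V2"], U))    # 3 4
print(injection_distance(code["V1"], U), injection_distance(code["V2"], U))  # 3 2
print(decode(code, U, subspace_distance))   # V1
print(decode(code, U, injection_distance))  # V2
```

Here $d/3 < \gamma < d/2$ holds because $7/3 < 3 < 7/2$, and the two decoders disagree exactly as the example describes.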

This situation can be intuitively explained as follows. The decoder (27) favors the subspace $\mathcal{V}_1$, which is closer in subspace distance to $\mathcal{U}$ than $\mathcal{V}_2$. However, since $\mathcal{V}_1$ is low-dimensional, $\mathcal{U}$ can only be produced from $\mathcal{V}_1$ by the insertion of $\gamma$ dimensions. The decoder (20), on the other hand, favors $\mathcal{V}_2$, which, although farther in subspace distance, can produce $\mathcal{U}$ after the replacement of $\epsilon < \gamma$ dimensions. Since one packet error must occur for each inserted or replaced dimension, we conclude that the decoder (20) finds the solution that minimizes the number of packet errors observed. ■

Remark: The subspace metric of [7] treats insertions and deletions of dimensions (called in [7] "errors" and "erasures", respectively) symmetrically. However, depending upon the position of the adversary in the network (namely, if there is a source-destination min-cut between the adversary and the destination), a single error packet may cause the replacement of a dimension (i.e., a simultaneous "error" and
"erasure" in the terminology of [7]). The injection distance, which is designed to "explain" a received 
subspace with as few error-packet injections as possible, properly accounts for this phenomenon, and 
hence the corresponding decoder produces a different result than a minimum subspace distance decoder. 
If it were possible to restrict the adversary so that each error-packet injection would only cause either 
an insertion or a deletion of a dimension (but not both), then the subspace distance of [7] would indeed 
be appropriate. However, this is not the model considered here. 

Let us now discuss an important fact about the subspace distance for general subspace codes (assuming for simplicity that $\rho = 0$). The packet error correction capability of a minimum-subspace-distance decoder, $t_S$, is not necessarily equal to $\lfloor (d_S(\Omega) - 1)/2 \rfloor$ or $\lfloor (d_S(\Omega) - 2)/4 \rfloor$, but lies somewhere in between. For instance, in the case of a constant-dimension code we have

$$d_I(\mathcal{V},\mathcal{V}') = \tfrac{1}{2} d_S(\mathcal{V},\mathcal{V}'), \quad \forall\, \mathcal{V}, \mathcal{V}' \in \Omega,$$

$$d_I(\Omega) = \tfrac{1}{2} d_S(\Omega).$$

Thus, Theorem 20 implies that $t_S = \lfloor (d_S(\Omega) - 2)/4 \rfloor$ exactly. In other words, in this special case, the approach in [7] coincides with that of this paper, and Theorem 20 provides a converse that was missing in [7]. On the other hand, suppose $\Omega$ is a subspace code consisting of just two codewords, one of which is a subspace of the other. Then we have precisely $t_S = \lfloor (d_S(\Omega) - 1)/2 \rfloor$, since $t_S + 1$ packet-injections are needed to get past halfway between the codewords.

Since no single quantity is known that perfectly describes the packet error correction capability of the minimum-subspace-distance decoder (27) for general subspace codes, we cannot provide a definitive comparison between decoders (27) and (20). However, we can still compute bounds for codes that fit into Example 4.

Example 5: Let us continue with Example 4. Now, we adjoin another codeword $\mathcal{V}_3 = \langle X_3 \rangle$ such that $d_S(\mathcal{V}_1,\mathcal{V}_3) = d$ and where $\gamma' = \dim \mathcal{V}_3 - \dim \mathcal{V}_1$ satisfies $d/3 < \gamma' < d/2$. Also we assume that $d_S(\mathcal{V}_2,\mathcal{V}_3)$ is sufficiently large so as not to interfere with the problem (e.g., $d_S(\mathcal{V}_2,\mathcal{V}_3) > 3d/2$).

Let $t_S$ and $t_I$ denote the packet error correction capabilities of the decoders (27) and (20), respectively. From the argument of Example 4, we get $t_I \ge \max\{\epsilon, \epsilon'\}$, while $t_S < \min\{\epsilon, \epsilon'\}$, where $\epsilon' = (d - \gamma')/2$. By choosing $\gamma \approx d/3$ and $\gamma' \approx d/2$, we get $\epsilon \approx d/3$ and $\epsilon' \approx d/4$. Thus, $t_I \ge (4/3)\, t_S$, i.e., we obtain a 1/3 increase in error correction capability by using the decoder (20). ■

VI. Conclusion

We have addressed the problem of error correction in network coding under a worst-case adversarial 
model. We show that certain metrics naturally arise as the fundamental parameter describing the error 
correction capability of a code; namely, the rank metric for coherent network coding, and the injection 
metric for noncoherent network coding. For coherent network coding, the framework based on the rank 
metric essentially subsumes previous analyses and constructions, with the advantage of providing a clear 
separation between the problems of designing a feasible network code and an error-correcting outer 
code. For noncoherent network coding, the injection metric provides a measure of code performance that 
is more precise, when a non-constant-dimension code is used, than the so-called subspace metric. The 
design of general subspace codes for the injection metric, as well as the derivation of bounds, is left as 
an open problem for future research. 

Appendix A 

Detection Capability 

When dealing with communication over an adversarial channel, there is little justification to consider the possibility of error detection. In principle, a code should be designed to be unambiguous (in which case error detection is not needed); otherwise, if there is any possibility for ambiguity at the receiver, then the adversary will certainly exploit this possibility, leading to a high probability of decoding failure (detected error). Still, if a system is such that (a) sequential transmissions are made over the same channel, (b) there exists a feedback link from the receiver to the transmitter, and (c) the adversary is not able to fully exploit the channel at all times, then it might be worth using a code with a lower correction capability (but higher rate) that has some ability to detect errors.

Following classical coding theory, we consider error detection in the presence of a bounded error-correcting decoder. More precisely, define a bounded-discrepancy decoder with correction radius $t$, or simply a $t$-discrepancy-correcting decoder, by

$$\hat{x}(y) = \begin{cases} x & \text{if } \Delta(x,y) \le t \text{ and } \Delta(x',y) > t \text{ for all } x' \ne x,\ x' \in \mathcal{C} \\ f & \text{otherwise.} \end{cases}$$

Of course, when using a $t$-discrepancy-correcting decoder, we implicitly assume that the code is $t$-discrepancy-correcting. The discrepancy detection capability of a code (under a $t$-discrepancy-correcting decoder) is the maximum value of discrepancy for which the decoder is guaranteed not to make an undetected error, i.e., it must return either the correct codeword or the failure symbol $f$.
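The decoder just defined amounts to returning the unique codeword within discrepancy $t$ of the received word, and signalling failure otherwise. A minimal sketch follows; it assumes the Hamming distance as a stand-in discrepancy function and uses `None` for the failure symbol $f$, both of which are illustrative choices rather than part of the original text.

```python
def hamming(x, y):
    """Stand-in discrepancy: number of positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

FAILURE = None  # plays the role of the failure symbol f

def bounded_discrepancy_decode(code, y, t, discrepancy=hamming):
    """Return x if it is the only codeword with discrepancy(x, y) <= t, else FAILURE."""
    close = [x for x in code if discrepancy(x, y) <= t]
    return close[0] if len(close) == 1 else FAILURE

# Hypothetical length-5 repetition code, decoded with radius t = 1.
code = ["00000", "11111"]
print(bounded_discrepancy_decode(code, "00001", t=1))  # 00000
print(bounded_discrepancy_decode(code, "00011", t=1))  # None: error detected, not corrected
```

With $t = 1$ the second received word is at discrepancy 2 and 3 from the two codewords, so the decoder reports failure instead of guessing.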






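For $t \in \mathbb{N}$, let the function $\sigma^t \colon \mathcal{X} \times \mathcal{X} \to \mathbb{N}$ be given by

$$\sigma^t(x,x') = \min_{y \in \mathcal{Y}:\ \Delta(x',y) \le t} \Delta(x,y) - 1. \tag{29}$$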

Proposition 23: The discrepancy-detection capability of a code $\mathcal{C}$ is given exactly by $\sigma^t(\mathcal{C})$. That is, under a $t$-discrepancy-correcting decoder, any discrepancy of magnitude $s$ can be detected if and only if $s \le \sigma^t(\mathcal{C})$.

Proof: Let $t < s \le \sigma^t(\mathcal{C})$. Suppose that $x \in \mathcal{X}$ is transmitted and $y \in \mathcal{Y}$ is received, where $t < \Delta(x,y) \le s$. We will show that $\Delta(x',y) > t$, for all $x' \in \mathcal{C}$. Suppose, by way of contradiction, that $\Delta(x',y) \le t$, for some $x' \in \mathcal{C}$, $x' \ne x$. Then $\sigma^t(\mathcal{C}) \le \Delta(x,y) - 1 \le s - 1 < s \le \sigma^t(\mathcal{C})$, which is a contradiction.

Conversely, assume that $\sigma^t(\mathcal{C}) < s$, i.e., $\sigma^t(\mathcal{C}) \le s - 1$. We will show that an undetected error may occur. Since $\sigma^t(\mathcal{C}) \le s - 1$, there exist $x, x' \in \mathcal{C}$ such that $\sigma^t(x,x') \le s - 1$. This implies that there exists some $y \in \mathcal{Y}$ such that $\Delta(x',y) \le t$ and $\Delta(x,y) - 1 \le s - 1$. By assumption, $\mathcal{C}$ is $t$-discrepancy-correcting, so $\hat{x}(y) = x'$. Thus, if $x$ is transmitted and $y$ is received, an undetected error will occur, even though $\Delta(x,y) \le s$. ■

The result above has also been obtained in [12], although with a different notation (in particular, treating $\sigma^t(x,x') + 1$ as a "distance" function). Below, we characterize the detection capability of a code in terms of the $\Delta$-distance.

Proposition 24: For any code $\mathcal{C}$, we have $\sigma^t(\mathcal{C}) \ge \delta(\mathcal{C}) - t - 1$.

Proof: For any $x, x' \in \mathcal{X}$, let $y \in \mathcal{Y}$ be a solution to the minimization in (29), i.e., $y$ is such that $\Delta(x',y) \le t$ and $\Delta(x,y) = 1 + \sigma^t(x,x')$. Then $\delta(x,x') \le \Delta(x,y) + \Delta(x',y) \le 1 + \sigma^t(x,x') + t$, which implies that $\sigma^t(x,x') \ge \delta(x,x') - t - 1$. ■

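Theorem 25: Suppose that $\Delta(\cdot,\cdot)$ is normal. For every code $\mathcal{C} \subseteq \mathcal{X}$, we have $\sigma^t(\mathcal{C}) = \delta(\mathcal{C}) - t - 1$.

Proof: We just need to show that $\sigma^t(\mathcal{C}) \le \delta(\mathcal{C}) - t - 1$. Take any $x, x' \in \mathcal{X}$. Since $\Delta(\cdot,\cdot)$ is normal, there exists some $y \in \mathcal{Y}$ such that $\Delta(x',y) = t$ and $\Delta(x,y) = \delta(x,x') - t$. Thus, $\sigma^t(x,x') \le \Delta(x,y) - 1 = \delta(x,x') - t - 1$. ■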
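The identity of Theorem 25 can be checked by exhaustive search on a toy example. The sketch below assumes the Hamming distance on binary words of length 5 as the discrepancy function (a normal discrepancy in the sense used above) and a two-word repetition code; the function names and parameters are illustrative assumptions.

```python
from itertools import product

def hamming(x, y):
    """Stand-in discrepancy: number of positions where x and y differ."""
    return sum(a != b for a, b in zip(x, y))

def delta_distance(code, space, disc=hamming):
    """delta(C): minimum over distinct codeword pairs of min_y disc(x,y) + disc(x',y)."""
    return min(disc(x, y) + disc(xp, y)
               for x in code for xp in code if x != xp
               for y in space)

def sigma_t(code, space, t, disc=hamming):
    """sigma^t(C): minimum of (29) over distinct codeword pairs."""
    return min(disc(x, y) - 1
               for x in code for xp in code if x != xp
               for y in space if disc(xp, y) <= t)

n = 5
space = list(product((0, 1), repeat=n))   # all received words
code = [(0,) * n, (1,) * n]               # hypothetical repetition code
for t in (1, 2):
    print(t, sigma_t(code, space, t), delta_distance(code, space) - t - 1)
```

For this code $\delta(\mathcal{C}) = 5$, so with $t = 1$ any discrepancy up to 3 is detected, and with $t = 2$ any discrepancy up to 2, which illustrates the trade-off between correction radius and detection capability.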

Appendix B 
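Proof of Lemma 15

First, we recall the following useful result shown in [8, Proposition 2]. Let $X, Y \in \mathbb{F}_q^{N \times m}$. Then

$$\operatorname{rank}(X - Y) \ge \max\{\operatorname{rank} X, \operatorname{rank} Y\} - \dim(\langle X \rangle \cap \langle Y \rangle). \tag{30}$$

Proof of Lemma 15: Using (30) and (3), we have

$$\begin{aligned}
\operatorname{rank}(AX - BY) &\ge \max\{\operatorname{rank} AX, \operatorname{rank} BY\} - \dim(\langle AX \rangle \cap \langle BY \rangle) \\
&\ge \max\{\operatorname{rank} X - \rho,\ \operatorname{rank} Y - \sigma\} - \dim(\langle X \rangle \cap \langle Y \rangle).
\end{aligned}$$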
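We will now show that this lower bound is achievable. Our approach will be to construct $A$ as $A = A_1 A_2$, where $A_1 \in \mathbb{F}_q^{L \times (L+\rho)}$ and $A_2 \in \mathbb{F}_q^{(L+\rho) \times n}$ are both full-rank matrices. Then (3) guarantees that $\operatorname{rank} A \ge n - \rho$. The matrix $B$ will be constructed similarly: $B = B_1 B_2$, where $B_1 \in \mathbb{F}_q^{L \times (L+\sigma)}$ and $B_2 \in \mathbb{F}_q^{(L+\sigma) \times N}$ are both full-rank.

Let $k = \operatorname{rank} X$, $s = \operatorname{rank} Y$, and $w = \dim(\langle X \rangle \cap \langle Y \rangle)$. Let $W \in \mathbb{F}_q^{w \times m}$ be such that $\langle W \rangle = \langle X \rangle \cap \langle Y \rangle$, let $\tilde{X} \in \mathbb{F}_q^{(k-w) \times m}$ be such that $\langle W \rangle + \langle \tilde{X} \rangle = \langle X \rangle$ and let $\tilde{Y} \in \mathbb{F}_q^{(s-w) \times m}$ be such that $\langle W \rangle + \langle \tilde{Y} \rangle = \langle Y \rangle$. Then, let $A_2$ and $B_2$ be such that

$$A_2 X = \begin{bmatrix} W \\ \tilde{X} \\ 0 \end{bmatrix} \quad \text{and} \quad B_2 Y = \begin{bmatrix} W \\ \tilde{Y} \\ 0 \end{bmatrix}.$$

Now, choose any $\bar{A} \in \mathbb{F}_q^{i \times (k-w)}$ and $\bar{B} \in \mathbb{F}_q^{j \times (s-w)}$ that have full row rank, where $i = [k - w - \rho]^+$ and $j = [s - w - \sigma]^+$. For instance, we may pick $\bar{A} = \begin{bmatrix} I & 0 \end{bmatrix}$ and $\bar{B} = \begin{bmatrix} I & 0 \end{bmatrix}$. Finally, let

$$A_1 = \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & \bar{A} & 0 & 0 \\ 0 & 0 & I & 0 \end{bmatrix} \quad \text{and} \quad B_1 = \begin{bmatrix} I & 0 & 0 & 0 \\ 0 & \bar{B} & 0 & 0 \\ 0 & 0 & I & 0 \end{bmatrix}$$

where, in both cases, the upper identity matrix is $w \times w$.

We have

$$\begin{aligned}
\operatorname{rank}(AX - BY) &= \operatorname{rank}(A_1 A_2 X - B_1 B_2 Y) \\
&= \operatorname{rank}\left( \begin{bmatrix} W \\ \bar{A}\tilde{X} \\ 0 \end{bmatrix} - \begin{bmatrix} W \\ \bar{B}\tilde{Y} \\ 0 \end{bmatrix} \right) \\
&= \max\{i, j\} \\
&= \max\{k - w - \rho,\ s - w - \sigma,\ 0\} \\
&= \left[ \max\{k - \rho,\ s - \sigma\} - w \right]^+.
\end{aligned}$$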